Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary
Cross-lingual model transfer is a compelling and popular method for
predicting annotations in a low-resource language, whereby parallel corpora
provide a bridge to a high-resource language and its associated annotated
corpora. However, parallel data is not readily available for many languages,
limiting the applicability of these approaches. We address this drawback with
a framework that takes advantage of cross-lingual word embeddings trained
solely on a high-coverage bilingual dictionary. We propose a novel neural
network model for joint training from both sources of data based on
cross-lingual word embeddings, and show substantial empirical improvements over
baseline techniques. We also propose several active learning heuristics, which
result in improvements over competitive benchmark methods.
Comment: 5 pages with 2 pages of references. Accepted to appear in ACL 201
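The abstract does not spell out the joint model, but as illustrative background, a standard way to obtain cross-lingual embeddings from a bilingual dictionary alone (no parallel corpora) is to learn an orthogonal map between two monolingual embedding spaces from the dictionary pairs. A minimal sketch with toy NumPy embeddings, not the paper's method:

```python
import numpy as np

def procrustes_align(X, Y):
    """Learn an orthogonal map W minimising ||X W - Y||_F, where rows of X and
    Y are embeddings of dictionary-paired words (source, target)."""
    # SVD of the cross-covariance matrix yields the optimal orthogonal map
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy example: 4 dictionary pairs in a 3-dimensional embedding space
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
# Construct Y as a rotation of X so that a perfect alignment exists
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
Y = X @ Q
W = procrustes_align(X, Y)
print(np.allclose(X @ W, Y))  # mapped source embeddings match the targets
```

Once aligned, source-language and target-language words live in a shared space, so a tagger trained on the high-resource language can be applied directly to the low-resource one.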
Graph-to-Sequence Learning using Gated Graph Neural Networks
Many NLP applications can be framed as a graph-to-sequence learning problem.
Previous work proposing neural architectures for this setting obtained
promising results compared to grammar-based approaches, but still relied on
linearisation heuristics and/or standard recurrent networks to achieve the
best performance.
In this work, we propose a new model that encodes the full structural
information contained in the graph. Our architecture couples the recently
proposed Gated Graph Neural Networks with an input transformation that allows
nodes and edges to have their own hidden representations, while tackling the
parameter explosion problem present in previous work. Experimental results show
that our model outperforms strong baselines in generation from AMR graphs and
syntax-based neural machine translation.
Comment: ACL 201
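As an illustration of the building block named in the title (not the paper's full architecture, which additionally gives nodes and edges their own hidden representations), one propagation step of a Gated Graph Neural Network updates each node's state with a GRU-style gate over messages aggregated from its neighbours. A minimal NumPy sketch with random toy parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(H, A, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GGNN propagation step: aggregate neighbour states via the
    adjacency matrix A, then apply a GRU-style gated update per node."""
    M = A @ H                              # messages from neighbours
    Z = sigmoid(M @ Wz + H @ Uz)           # update gate
    R = sigmoid(M @ Wr + H @ Ur)           # reset gate
    Htil = np.tanh(M @ Wh + (R * H) @ Uh)  # candidate hidden state
    return (1 - Z) * H + Z * Htil

# Toy graph: 3 nodes in a chain, hidden size 4
rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
params = [rng.normal(scale=0.1, size=(4, 4)) for _ in range(6)]
H1 = ggnn_step(H, A, *params)
print(H1.shape)  # (3, 4): one updated state vector per node
```

Stacking several such steps lets information flow along graph paths before a decoder generates the output sequence.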
Classifying Tweet Level Judgements of Rumours in Social Media
Social media is a rich source of rumours and corresponding community
reactions. Rumours reflect different characteristics, some shared and some
individual. We formulate the problem of classifying tweet level judgements of
rumours as a supervised learning task. Both supervised and unsupervised domain
adaptation are considered, in which tweets from a rumour are classified on the
basis of other annotated rumours. We demonstrate how multi-task learning helps
achieve good results on rumours from the 2011 England riots.
Learning how to Active Learn: A Deep Reinforcement Learning Approach
Active learning aims to select a small subset of data for annotation such
that a classifier learned on the data is highly accurate. This is usually done
using heuristic selection methods; however, the effectiveness of such methods
is limited and, moreover, the performance of heuristics varies between
datasets. To address these shortcomings, we introduce a novel formulation by
reframing active learning as a reinforcement learning problem and explicitly
learning a
data selection policy, where the policy takes the role of the active learning
heuristic. Importantly, our method allows the selection policy learned using
simulation on one language to be transferred to other languages. We demonstrate
our method using cross-lingual named entity recognition, observing uniform
improvements over traditional active learning.
Comment: To appear in EMNLP 201
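For context, a typical fixed heuristic of the kind the learned policy is intended to replace is margin-of-confidence sampling: pick the unlabelled items whose top two class probabilities are closest. A minimal sketch (all names and numbers are illustrative, not from the paper):

```python
def uncertainty(probs):
    """Margin-based uncertainty: high when the top two class
    probabilities are close, i.e. the classifier is unsure."""
    top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top2[0] - top2[1])

def select_batch(pool, policy, k):
    """Score unlabelled items and pick the top k for annotation.
    In the paper's framing, a learned policy takes the place of a
    fixed scoring heuristic like `uncertainty`."""
    scored = sorted(pool, key=lambda item: -policy(item["probs"]))
    return scored[:k]

pool = [
    {"id": 0, "probs": [0.9, 0.05, 0.05]},  # confident -> low priority
    {"id": 1, "probs": [0.4, 0.35, 0.25]},  # uncertain -> high priority
    {"id": 2, "probs": [0.5, 0.45, 0.05]},  # uncertain -> high priority
]
batch = select_batch(pool, uncertainty, 2)
print([item["id"] for item in batch])  # the two uncertain items
```

The reinforcement-learning view replaces the hand-written scoring function with a policy trained to maximise downstream classifier accuracy, which is what makes cross-lingual transfer of the policy possible.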
Exploring Prediction Uncertainty in Machine Translation Quality Estimation
Machine Translation Quality Estimation is a notoriously difficult task, which
lessens its usefulness in real-world translation environments. Such scenarios
can be improved if quality predictions are accompanied by a measure of
uncertainty. However, models in this task are traditionally evaluated only in
terms of point estimate metrics, which do not take prediction uncertainty into
account. We investigate probabilistic methods for Quality Estimation that can
provide well-calibrated uncertainty estimates and evaluate them in terms of
their full posterior predictive distributions. We also show how this posterior
information can be useful in an asymmetric risk scenario, which aims to capture
typical situations in translation workflows.
Comment: Proceedings of CoNLL 201
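To illustrate why point-estimate metrics miss calibration: negative log predictive density (NLPD) scores the full predictive distribution rather than just its mean, and an asymmetric loss can penalise under- and over-prediction of quality differently. A sketch assuming Gaussian predictive distributions, with illustrative weights that are not from the paper:

```python
import math
import random

def gaussian_nlpd(y, mu, sigma):
    """Negative log predictive density of y under N(mu, sigma^2).
    A point metric like MAE ignores sigma; NLPD rewards predictors
    whose stated uncertainty matches their actual errors."""
    return 0.5 * math.log(2 * math.pi * sigma**2) + (y - mu) ** 2 / (2 * sigma**2)

def asymmetric_risk(mu, sigma, y, w_under=3.0, w_over=1.0, n=10_000):
    """Expected asymmetric linear loss under the predictive distribution,
    estimated by sampling: under-predicting quality costs w_under per
    unit, over-predicting costs w_over (weights are illustrative)."""
    random.seed(0)
    total = 0.0
    for _ in range(n):
        err = random.gauss(mu, sigma) - y
        total += w_under * max(-err, 0) + w_over * max(err, 0)
    return total / n

# Two predictors with the same point estimate but different uncertainty:
# the overconfident one (sigma=0.05) is punished by NLPD, while MAE
# would score them identically.
print(gaussian_nlpd(0.7, 0.6, 0.05))
print(gaussian_nlpd(0.7, 0.6, 0.2))
```

Under the asymmetric risk, a workflow that would rather over-estimate post-editing effort than under-estimate it can choose predictions that minimise the expected weighted loss instead of the posterior mean.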